Frequency Domain Predictive Modelling with Aggregated Data

Authors

  • Avradeep Bhowmik
  • Joydeep Ghosh
  • Oluwasanmi Koyejo
Abstract

Existing work in spatio-temporal data analysis invariably assumes data available as individual measurements with localised estimates. However, for many applications like econometrics, financial forecasting and climate science, data is often obtained as aggregates. Data aggregation presents severe mathematical challenges to learning and inference, and application of standard techniques is susceptible to ecological fallacy. In this manuscript we investigate the problem of predictive linear modelling in the scenario where data is aggregated in a non-uniform manner across targets and features. We introduce a novel formulation of the problem in the frequency domain, and develop algorithmic techniques that exploit the duality properties of Fourier analysis to bypass the inherent structural challenges of this setting. We provide theoretical guarantees for generalisation error for our estimation procedure and extend our analysis to capture approximation effects arising from aliasing. Finally, we perform empirical evaluation to demonstrate the efficacy of our algorithmic approach in predictive modelling on synthetic data, and on three real datasets from agricultural studies, ecological surveys and climate science.
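
The role of Fourier duality mentioned in the abstract can be illustrated with a small sketch. It is not the authors' algorithm; it only shows the standard convolution theorem at work under strong simplifying assumptions that are not in the abstract: aggregation is modelled as a circular moving average with a known window, each series has its own window width, there is no noise and no sub-sampling (so no aliasing), and the series length is prime so that no Fourier coefficient of a window vanishes. All function names are hypothetical.

```python
import cmath
import random


def dft(signal):
    """Naive discrete Fourier transform (O(n^2)), enough for a short demo."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def box_kernel(n, width):
    """Moving-average window of the given width, zero-padded to length n."""
    return [1.0 / width if t < width else 0.0 for t in range(n)]


def circular_moving_average(signal, width):
    """Aggregate a series by circular convolution with a box window."""
    n = len(signal)
    return [sum(signal[(t - s) % n] for s in range(width)) / width for t in range(n)]


def undo_aggregation_in_frequency(aggregated, width):
    """Divide out the window's spectrum (assumes it has no zero coefficients)."""
    n = len(aggregated)
    window_spectrum = dft(box_kernel(n, width))
    return [a / k for a, k in zip(dft(aggregated), window_spectrum)]


def fit_from_aggregates(y_agg, x1_agg, x2_agg, y_width, x1_width, x2_width):
    """Least-squares fit of y = b1*x1 + b2*x2 using only aggregated series."""
    target = undo_aggregation_in_frequency(y_agg, y_width)
    f1 = undo_aggregation_in_frequency(x1_agg, x1_width)
    f2 = undo_aggregation_in_frequency(x2_agg, x2_width)
    # Real parts of the normal equations, since the coefficients are real.
    a11 = sum((u.conjugate() * u).real for u in f1)
    a12 = sum((u.conjugate() * v).real for u, v in zip(f1, f2))
    a22 = sum((v.conjugate() * v).real for v in f2)
    c1 = sum((u.conjugate() * z).real for u, z in zip(f1, target))
    c2 = sum((v.conjugate() * z).real for v, z in zip(f2, target))
    det = a11 * a22 - a12 * a12
    return ((c1 * a22 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det)


# Synthetic, noiseless data; the length 31 is prime on purpose.
random.seed(0)
n = 31
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
x2 = [random.gauss(0.0, 1.0) for _ in range(n)]
true_beta = (2.0, -1.0)
y = [true_beta[0] * a + true_beta[1] * b for a, b in zip(x1, x2)]

# Each series is aggregated with a different window width.
beta_hat = fit_from_aggregates(
    circular_moving_average(y, 3),
    circular_moving_average(x1, 5),
    circular_moving_average(x2, 2),
    3, 5, 2,
)
print(beta_hat)
```

In the time domain the three aggregated series are no longer related by the original linear model, because each was smoothed differently. After the transform, each aggregation becomes a per-frequency multiplication that can be divided out, and linearity of the transform carries the same coefficients over to the frequency domain. Real aggregated data are typically also sub-sampled and noisy, which is where the aliasing analysis mentioned in the abstract becomes relevant; this sketch does not cover that case.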

Similar resources

Calculation of One-dimensional Forward Modelling of Helicopter-borne Electromagnetic Data and a Sensitivity Matrix Using Fast Hankel Transforms

The helicopter-borne electromagnetic (HEM) frequency-domain exploration method is an airborne electromagnetic (AEM) technique that is widely used for vast and rough areas for resistivity imaging. The vast amount of digitized data flowing from the HEM method requires an efficient and accurate inversion algorithm. Generally, the inverse modelling of HEM data in the first step requires a precise a...

Attenuation of spatial aliasing in CMP domain by non-linear interpolation of seismic data along local slopes

Spatial aliasing is an unwanted side effect that produces artifacts during seismic data processing, imaging and interpolation. It is often caused by insufficient spatial sampling of seismic data and often happens in CMP (Common Mid-Point) gather. To tackle this artifact, several techniques have been developed in time-space domain as well as frequency domain such as frequency-wavenumber, frequen...

Model Learning from Published Aggregated Data

In many application domains, particularly in healthcare, access to individual datapoints is limited, while data aggregated in the form of means and standard deviations are widely available. This limitation is a result of many factors, including privacy laws that prevent clinicians and scientists from freely sharing individual patient data, inability to share proprietary business data, and inade...

Proposing New Methods to Enhance the Low-Resolution Simulated GPR Responses in the Frequency and Wavelet Domains

To date, a number of numerical methods, including the popular Finite-Difference Time Domain (FDTD) technique, have been proposed to simulate Ground-Penetrating Radar (GPR) responses. Despite having a number of advantages, the finite-difference method also has pitfalls such as being very time consuming in simulating the most common case of media with high dielectric permittivity, causing the for...

Publication date: 2017